Class imbalance classification is a challenging research problem in data mining and machine learning, as most real-life datasets are imbalanced in nature. Existing learning algorithms typically maximise classification accuracy by correctly classifying the majority class, at the cost of misclassifying the minority class. In real-life applications, however, the minority class instances usually represent the concept of greater interest. Recently, several techniques based on sampling methods (under-sampling the majority class and over-sampling the minority class), cost-sensitive learning methods, and ensemble learning have been used in the literature for classifying imbalanced datasets. In this paper, we introduce a new clustering-based under-sampling approach with a boosting (AdaBoost) algorithm, called CUSBoost, for effective imbalanced classification. The proposed algorithm provides an alternative to the RUSBoost (random under-sampling with AdaBoost) and SMOTEBoost (synthetic minority over-sampling with AdaBoost) algorithms. We evaluated the performance of the CUSBoost algorithm against state-of-the-art ensemble learning methods such as AdaBoost, RUSBoost, and SMOTEBoost on 13 imbalanced binary and multi-class datasets with various imbalance ratios. The experimental results show that CUSBoost is a promising and effective approach for dealing with highly imbalanced datasets.
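
To make the idea of clustering-based under-sampling concrete, the following is a minimal sketch of the sampling step only; it is not the authors' implementation and it omits the AdaBoost loop. It assumes the majority class is grouped with plain k-means and that a fixed fraction of each cluster is kept at random, while every minority instance is retained. The function names (`kmeans`, `cluster_undersample`), the number of clusters `k`, and `keep_fraction` are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Assumes len(points) >= k and that all points have the same dimension.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_undersample(X, y, majority_label, k=3, keep_fraction=0.5, seed=0):
    """Keep all minority instances and a random fraction of each majority cluster.

    Illustrative assumption: k and keep_fraction are free parameters here.
    """
    rng = random.Random(seed)
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    min_idx = [i for i, label in enumerate(y) if label != majority_label]

    cluster_of = kmeans([X[i] for i in maj_idx], k, seed=seed)

    kept = list(min_idx)
    for c in range(k):
        members = [i for i, lab in zip(maj_idx, cluster_of) if lab == c]
        if members:
            n_keep = max(1, int(len(members) * keep_fraction))
            kept.extend(rng.sample(members, n_keep))

    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data: 12 majority points (label 0) in three groups, 3 minority points (label 1).
X_maj = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.1, 5.1),
    (0.0, 5.0), (0.1, 5.2), (0.2, 5.1), (0.1, 5.1),
]
X_min = [(2.5, 2.5), (2.6, 2.4), (2.4, 2.6)]
X = X_maj + X_min
y = [0] * len(X_maj) + [1] * len(X_min)

X_res, y_res = cluster_undersample(X, y, majority_label=0, k=3, keep_fraction=0.5)
print("majority kept:", y_res.count(0), "minority kept:", y_res.count(1))
```

Sampling within clusters rather than uniformly at random is intended to keep each region of the majority class represented after under-sampling. In a boosting setting in the style of RUSBoost, a step like this would presumably be applied to the training data before fitting each weak learner, but that integration is not shown here.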